Spain Global Economy, Employment and Public Debt 2002-2020

We are going to analyze the evolution of Spain's economy from 2002 to 2020, based on macroeconomic data such as:

  • Employment and unemployment rates
  • Public Debt
  • Public Spending
  • GDP (PIB in Spanish)

The data was collected from the following sources: INE (Instituto Nacional de Estadística), datosmacro.com, and Ministerio de Hacienda.

Our analysis covers 2002 to 2020 because it was carried out in December 2021, when full-year data for 2021 was not yet available.

Spain has the highest unemployment rate in Europe and one of the highest in the world. We want to analyze how it evolved.

Imports and Environment Settings

In [ ]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly
plotly.offline.init_notebook_mode()
from plotly.subplots import make_subplots
import seaborn as sns
sns.set(rc={'figure.figsize':(14,10)}, style='whitegrid')
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

Reading the Data

We have obtained several datasets, and we have combined them into a single one.

In [ ]:
data = pd.read_csv('../spain_data.csv')
data.head()
Out[ ]:
year public_debt(M€) total_public_spendings total_spending_over_pib spendings_per_capita(€) mean_salary(€) debt_over_pib(%) debt_per_capita(€) total_private_employment total_public_employment PIB_total(M€) PIB_per_capita(€) unemployment_rate(%)
0 2002 384145 289607 38.6 6868 18601 51.2 9184 14270000 2649000 749552 18090 11.61
1 2003 382775 307871 38.4 7161 19385 47.7 8996 14879000 2766000 802266 19010 11.37
2 2004 389888 333736 38.8 7663 20045 45.4 9005 15411000 2877000 859437 20050 10.53
3 2005 393479 356857 38.5 8053 20616 42.4 8941 16461000 2960000 927357 21240 8.71
4 2006 392132 385827 38.4 8577 21168 39.1 8756 17146000 2944000 1003823 22630 8.26

Our dates are stored as integers. Each value represents the last day of the year, so we can convert the years to dates that point to that day.

In [ ]:
# Build an end-of-year date (December 31) for each integer year.
data['year'] = pd.to_datetime(data['year'].astype(str) + '-12-31')
data.head()
Out[ ]:
year public_debt(M€) total_public_spendings total_spending_over_pib spendings_per_capita(€) mean_salary(€) debt_over_pib(%) debt_per_capita(€) total_private_employment total_public_employment PIB_total(M€) PIB_per_capita(€) unemployment_rate(%)
0 2002-12-31 384145 289607 38.6 6868 18601 51.2 9184 14270000 2649000 749552 18090 11.61
1 2003-12-31 382775 307871 38.4 7161 19385 47.7 8996 14879000 2766000 802266 19010 11.37
2 2004-12-31 389888 333736 38.8 7663 20045 45.4 9005 15411000 2877000 859437 20050 10.53
3 2005-12-31 393479 356857 38.5 8053 20616 42.4 8941 16461000 2960000 927357 21240 8.71
4 2006-12-31 392132 385827 38.4 8577 21168 39.1 8756 17146000 2944000 1003823 22630 8.26

The remaining data is numeric, as we want.

In [ ]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19 entries, 0 to 18
Data columns (total 13 columns):
 #   Column                    Non-Null Count  Dtype         
---  ------                    --------------  -----         
 0   year                      19 non-null     datetime64[ns]
 1   public_debt(M€)           19 non-null     int64         
 2   total_public_spendings    19 non-null     int64         
 3   total_spending_over_pib   19 non-null     float64       
 4   spendings_per_capita(€)   19 non-null     int64         
 5   mean_salary(€)            19 non-null     int64         
 6   debt_over_pib(%)          19 non-null     float64       
 7   debt_per_capita(€)        19 non-null     int64         
 8   total_private_employment  19 non-null     int64         
 9   total_public_employment   19 non-null     int64         
 10  PIB_total(M€)             19 non-null     int64         
 11  PIB_per_capita(€)         19 non-null     int64         
 12  unemployment_rate(%)      19 non-null     float64       
dtypes: datetime64[ns](1), float64(3), int64(9)
memory usage: 2.1 KB

Correlations

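Let's plot the correlations between the variables:

In [ ]:
# Exclude the datetime column so that only numeric variables are correlated.
sns.heatmap(data.drop(columns='year').corr(), annot=True, cmap='Greens');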

We can see that many variables are correlated with each other. Most of the correlations of total_private_employment are negative. This is especially striking for the correlation with unemployment_rate: the higher the total private employment, the lower the unemployment rate.

The unemployment_rate is highly correlated with total_spending_over_pib and mean_salary. This makes sense, since the unemployment rate in Spain grows over this period, as does the mean salary. debt_over_pib and debt_per_capita are highly correlated with unemployment_rate as well.
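To read one column of the heatmap more easily, the correlations with the unemployment rate can be extracted and sorted. The sketch below is only an illustration of the mechanics: it uses the five rows (2002-2006) shown in the data.head() output above and a subset of the columns, so its values will differ from the full 2002-2020 heatmap. The names `sample` and `unemployment_corr` are introduced here for illustration.

```python
import pandas as pd

# First five rows (2002-2006), copied from the data.head() output above.
sample = pd.DataFrame({
    'total_private_employment': [14270000, 14879000, 15411000, 16461000, 17146000],
    'mean_salary(€)': [18601, 19385, 20045, 20616, 21168],
    'debt_over_pib(%)': [51.2, 47.7, 45.4, 42.4, 39.1],
    'unemployment_rate(%)': [11.61, 11.37, 10.53, 8.71, 8.26],
})

# Pearson correlation of every column with the unemployment rate,
# sorted from most negative to most positive.
unemployment_corr = (
    sample.corr()['unemployment_rate(%)']
    .drop('unemployment_rate(%)')
    .sort_values()
)
print(unemployment_corr)
```

In these early years unemployment was falling while salaries were rising, so the sign for mean_salary in this subsample is not the one seen over the whole period. This is a reminder that a correlation depends heavily on the window it is computed over.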

1. Unemployment Rate - Analysis

We want to analyze the unemployment rate because it is Spain's main economic problem; public debt is also a problem for the country.

Let's see the evolution of the unemployment between 2002-2020:

In [ ]:
fig = px.line(data, x='year', y=data['unemployment_rate(%)'], 
        title='Spain - Evolution of Unemployment Rate 2002-2020', 
        template='plotly_white', markers=True)
fig.update_layout(height=400, width=900, template='plotly_white',
                  autosize=False, showlegend=False)
fig.show()

From 2008 the unemployment rate grows to about 26% (2013-2014). It then goes down until 2020, when it starts to grow again (this may be related to the COVID-19 pandemic).

As we saw before, the unemployment rate is highly correlated with public spending over GDP and debt over GDP. Let's plot these variables:

In [ ]:
fig = make_subplots(rows=2, cols=1, shared_xaxes=False, 
                    subplot_titles=('Spain - Evolution of Public Spendings over GDP', 
                                    'Spain - Evolution of Debt over GDP'))
# add_trace replaces the deprecated append_trace; row and col select the subplot.
fig.add_trace(go.Scatter(x=data['year'], y=data['total_spending_over_pib'], 
                         mode='lines+markers', name='Public Spendings over GDP', 
                         line=dict(color='#d62728')), row=1, col=1)
fig.add_trace(go.Scatter(x=data['year'], y=data['debt_over_pib(%)'], 
                         mode='lines+markers', name='Debt over GDP (%)', 
                         line=dict(color='#2ca02c')), row=2, col=1)
fig.update_layout(height=900, width=1100, template='plotly_white',
                  autosize=False, showlegend=True)
fig.show()

Scaling the variables to plot them together

We can scale the data to plot the evolution of the three variables.
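The idea is to rescale each series so that all three start at the same level in 2002, which makes their shapes comparable on one axis. A minimal sketch of that idea, assuming we simply divide each column by its own first value (the notebook's own cell uses rounded constants, 4.41 ≈ 51.2 / 11.61 and 1.326 ≈ 51.2 / 38.6, to the same effect). The names `sample` and `indexed` are introduced here for illustration, and only the 2002 and 2003 rows from the data.head() output are used:

```python
import pandas as pd

# 2002 and 2003 values, copied from the data.head() output above.
sample = pd.DataFrame({
    'unemployment_rate(%)': [11.61, 11.37],
    'total_spending_over_pib': [38.6, 38.4],
    'debt_over_pib(%)': [51.2, 47.7],
})

# Divide each column by its first (2002) value so every series starts at 1.
indexed = sample / sample.iloc[0]
print(indexed.round(3))
```

With this form there are no hand-computed factors to maintain, at the cost of slightly different numbers from the rounded constants.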

In [ ]:
data.loc[0, ['unemployment_rate(%)', 'total_spending_over_pib', 'debt_over_pib(%)']]
Out[ ]:
unemployment_rate(%)       11.61
total_spending_over_pib     38.6
debt_over_pib(%)            51.2
Name: 0, dtype: object
In [ ]:
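# Work on a copy so the assignments below do not touch (or warn about) the original frame.
scaled_variables = data[['year', 'unemployment_rate(%)', 'total_spending_over_pib', 'debt_over_pib(%)']].copy()
# Bring the 2002 unemployment (11.61) and spending (38.6) values up to the 2002 debt ratio (51.2).
scaled_variables['unemployment_rate(%)'] = scaled_variables['unemployment_rate(%)'] * 4.41
scaled_variables['total_spending_over_pib'] = scaled_variables['total_spending_over_pib'] * 1.326
# Divide by 51.2 so that all three series start at about 1 in 2002.
scaled_variables[['unemployment_rate(%)', 'total_spending_over_pib', 'debt_over_pib(%)']] = scaled_variables[['unemployment_rate(%)', 'total_spending_over_pib', 'debt_over_pib(%)']]/51.2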
In [ ]:
fig = px.line(scaled_variables, x='year', y=scaled_variables.columns[1:], 
        title='Spain - Evolution of Unemployment Rate, Public Spendings and Debt 2002-2020', 
        template='plotly_white', markers=True)
fig.update_layout(height=500, width=1100, template='plotly_white',
                  autosize=False, showlegend=True)
fig.show()

Here we can see that the unemployment rate tends to follow the debt over GDP; the changes in total public spending over GDP are less aggressive. However, this variable moves in the same way as the other two, so it may be a consequence or a cause of the unemployment rate, or at least there is some relationship between them.

2. GDP (PIB in Spanish)

Let's analyze the evolution of GDP in Spain, and the related variables.

In [ ]:
fig = make_subplots(rows=2, cols=1, 
                    subplot_titles=('Spain - Evolution of GDP 2002-2020', 
                                    'Spain - Evolution of Debt over GDP 2002-2020'))
# add_trace replaces the deprecated append_trace; row and col select the subplot.
fig.add_trace(go.Scatter(x=data['year'], y=data['PIB_total(M€)'], 
                         mode='lines+markers', name='Total GDP (Millions)', 
                         line=dict(color='orange')), row=1, col=1)
fig.add_trace(go.Scatter(x=data['year'], y=data['debt_over_pib(%)'], 
                         mode='lines+markers', name='Debt over GDP (%)', 
                         line=dict(color='grey')), row=2, col=1)
fig.update_layout(height=700, width=1000, template='plotly_white',
                  autosize=False, showlegend=False)
fig.show()

GROWTH OF BOTH INDICES:

In [ ]:
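# Percentage growth between the first and last value of each series.
# GDP is in thousands of M€; debt over GDP is in %.
gdp_growth = -100 + 1121.94 / 749.55 * 100
gdp_debt_growth = -100 + 120 / 51.2 * 100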
fig = px.bar(x=['GDP Growth', 'Debt over GDP Growth'], y=[gdp_growth, gdp_debt_growth], 
        title='Spain - GDP and Debt over GDP Growth 2002-2020', 
        template='plotly_white', labels={
                     "x": "",
                     "y": "Growth %"})
fig.update_layout(height=400, width=900, template='plotly_white',
                  autosize=False, showlegend=False)
fig.show()

SCALED VALUES:

In [ ]:
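scaled_val = data[['year', 'PIB_total(M€)', 'debt_over_pib(%)']].copy()
# 14639.6875 = 749552 / 51.2 maps the 2002 GDP onto the 2002 debt ratio (51.2).
scaled_val['PIB_total(M€)'] = scaled_val['PIB_total(M€)'] / 14639.6875
# Divide by 51.2 so that both series start at 1 in 2002.
scaled_val[['PIB_total(M€)', 'debt_over_pib(%)']] = scaled_val[['PIB_total(M€)', 'debt_over_pib(%)']] / 51.2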
fig = px.bar(scaled_val, x='year', y=scaled_val.columns[1:],template='plotly_white',
        title='Spain - Evolution of GDP and Debt over GDP (scaled values)')
fig.update_layout(height=400, width=1000,
                  autosize=False, showlegend=True)
fig.show()
In [ ]:
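scaled_val = data[['year', 'PIB_total(M€)', 'debt_over_pib(%)']].copy()
# 14639.6875 = 749552 / 51.2 maps the 2002 GDP onto the 2002 debt ratio (51.2).
scaled_val['PIB_total(M€)'] = scaled_val['PIB_total(M€)'] / 14639.6875
# Divide by 51.2 so that both series start at 1 in 2002.
scaled_val[['PIB_total(M€)', 'debt_over_pib(%)']] = scaled_val[['PIB_total(M€)', 'debt_over_pib(%)']] / 51.2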
fig = px.line(scaled_val, x='year', y=scaled_val.columns[1:],template='plotly_white',
        title='Spain - Evolution of GDP and Debt over GDP (scaled values)')
fig.update_layout(height=400, width=1000,
                  autosize=False, showlegend=True)
fig.show()

Spanish GDP rises about 50% from 2002 to 2020 (from about 0.75 to 1.12 trillion euros). However, debt over GDP rises from about 51% to 120%, a relative increase of about 134%, almost three times the GDP growth. Debt over GDP was 35% in 2007 and rises to 100% in only seven years (2014). In the last year, debt over GDP rises 20 percentage points, from 100% to 120%. This may be related to total public spending:
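Because debt over GDP is itself a percentage, its change can be expressed either as a relative growth or as a difference in percentage points, and the two are easy to confuse. The sketch below reworks the arithmetic with the same start and end values used in the growth cell above; the helper `relative_growth` and the variable names are introduced here for illustration.

```python
def relative_growth(start, end):
    """Return the percentage change from start to end."""
    return (end / start - 1) * 100

# Start (2002) and end (2020) values as used in the growth cell above:
# GDP in thousands of M€, debt over GDP in %.
gdp_growth = relative_growth(749.55, 1121.94)
debt_ratio_growth = relative_growth(51.2, 120)

# The same debt change expressed in percentage points of GDP.
debt_ratio_change_pp = 120 - 51.2

print(f"GDP growth: {gdp_growth:.1f}%")
print(f"Debt over GDP, relative growth: {debt_ratio_growth:.1f}%")
print(f"Debt over GDP, change: {debt_ratio_change_pp:.1f} percentage points")
print(f"Ratio of the two growth rates: {debt_ratio_growth / gdp_growth:.2f}")
```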

In [ ]:
fig = px.line(data, x='year', y=data['total_public_spendings'], 
        title='Spain - Evolution of Public Spendings 2002-2020', 
        template='plotly_white', markers=True)
fig.update_layout(height=400, width=900, template='plotly_white',
                  autosize=False, showlegend=False)
fig.show()

It's interesting that public spending rises like the unemployment rate. This variable is, of course, closely related to GDP growth as well, but we can notice that GDP grows about 50%, as we said before, while public spending grows about 100%, double that, over the same period.

Public and Private Employment

Let's analyze public and private employment. As we saw, Spanish public spending doubled in less than 20 years, so it makes sense that public employment rises as well. We can see in the next plot that the same does not happen with private employment:

In [ ]:
fig = make_subplots(rows=2, cols=1, 
                    subplot_titles=('Spain - Evolution of Public Employment 2002-2020', 
                                    'Spain - Evolution of Private Employment 2002-2020'))
# add_trace replaces the deprecated append_trace; row and col select the subplot.
fig.add_trace(go.Scatter(x=data['year'], y=data['total_public_employment'], 
                         mode='lines+markers', name='Public Employments', 
                         line=dict(color='red')), row=1, col=1)
fig.add_trace(go.Scatter(x=data['year'], y=data['total_private_employment'], 
                         mode='lines+markers', name='Private Employments', 
                         line=dict(color='blue')), row=2, col=1)
fig.update_layout(height=700, width=900, template='plotly_white',
                  autosize=False, showlegend=False)
fig.show()

GROWTH OF BOTH KINDS OF EMPLOYMENT 2002-2020:

In [ ]:
public_emp_growth = -100 + (3.337 / 2.649) * 100
private_emp_growth = -100 + (15.839 / 14.27) * 100
fig = px.bar(x=['Public Employment Growth', 'Private Employment Growth'], y=[public_emp_growth, private_emp_growth], 
        title='Spain - Public and Private Employment Growth 2002-2020', 
        template='plotly_white', labels={
                     "x": "",
                     "y": "Growth %"})
fig.update_layout(height=400, width=900, template='plotly_white',
                  autosize=False, showlegend=False)
fig.show()
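Another way to compare the two growth rates is to look at the share of public employment in total employment. The sketch below reuses the four values from the growth cell above, assuming they are the 2002 and 2020 totals in millions of workers; the helper `public_share` and the variable names are introduced here for illustration.

```python
# Employment in millions of workers, as used in the growth cell above
# (assumed to be the 2002 and 2020 values).
public_2002, private_2002 = 2.649, 14.27
public_2020, private_2020 = 3.337, 15.839

def public_share(public, private):
    """Return public employment as a percentage of total employment."""
    return public / (public + private) * 100

share_2002 = public_share(public_2002, private_2002)
share_2020 = public_share(public_2020, private_2020)

print(f"Public share of employment in 2002: {share_2002:.1f}%")
print(f"Public share of employment in 2020: {share_2020:.1f}%")
```

Since public employment grows faster than private employment over the period, its share of the total rises.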

SCALED VALUES:

In [ ]:
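scaled_val = data[['year', 'total_private_employment', 'total_public_employment']].copy()
# 5.387 ≈ 14270000 / 2649000 maps 2002 private employment onto 2002 public employment.
scaled_val['total_private_employment'] = scaled_val['total_private_employment'] / 5.387
# Divide by 2649000 so that both series start at about 1 in 2002.
scaled_val[['total_public_employment', 'total_private_employment']] = scaled_val[['total_public_employment', 'total_private_employment']] / 2649000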
fig = px.bar(scaled_val, x='year', y=scaled_val.columns[1:],template='plotly_white',
        title='Spain - Evolution of Public and Private Employment (scaled values)')
fig.update_layout(height=400, width=900,
                  autosize=False, showlegend=True)
fig.show()
In [ ]:
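scaled_val = data[['year', 'total_private_employment', 'total_public_employment']].copy()
# 5.387 ≈ 14270000 / 2649000 maps 2002 private employment onto 2002 public employment.
scaled_val['total_private_employment'] = scaled_val['total_private_employment'] / 5.387
# Divide by 2649000 so that both series start at about 1 in 2002.
scaled_val[['total_public_employment', 'total_private_employment']] = scaled_val[['total_public_employment', 'total_private_employment']] / 2649000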
fig = px.line(scaled_val, x='year', y=scaled_val.columns[1:],template='plotly_white',
        title='Spain - Evolution of Public and Private Employment (scaled values)')
fig.update_layout(height=400, width=1000,
                  autosize=False, showlegend=True)
fig.show()
In [ ]:
fig.write_html("file.html")

Conclusions

As we can see, the variables most closely related to the high unemployment rate appear to be debt over GDP and public spending.

It's difficult to analyze this kind of problem with data alone, since many external economic and social factors can directly affect unemployment in a country. But it's obvious that Spain did something wrong, because it leads the unemployment-rate table in Europe.

Public employment grows with the unemployment rate, while private employment moves the other way: when private employment grows, unemployment falls. Maybe one of the keys will be to focus on private employment and try to waste less money on public spending and public employment, because the data suggests that private employment is more closely related to economic health than public employment.